A complexity reduction algorithm for analysis and annotation of large genomic sequences.

نویسندگان

  • Trees-Juen Chuang
  • Wen-Chang Lin
  • Hurng-Chun Lee
  • Chi-Wei Wang
  • Keh-Lin Hsiao
  • Zi-Hao Wang
  • Danny Shieh
  • Simon C Lin
  • Lan-Yang Ch'ang
چکیده

DNA is a universal language encrypted with biological instruction for life. In higher organisms, the genetic information is preserved predominantly in an organized exon/intron structure. When a gene is expressed, the exons are spliced together to form the transcript for protein synthesis. We have developed a complexity reduction algorithm for sequence analysis (CRASA) that enables direct alignment of cDNA sequences to the genome. This method features a progressive data structure in hierarchical orders to facilitate a fast and efficient search mechanism. CRASA implementation was tested with already annotated genomic sequences in two benchmark data sets and compared with 15 annotation programs (10 ab initio and 5 homology-based approaches) against the EST database. By the use of layered noise filters, the complexity of CRASA-matched data was reduced exponentially. The results from the benchmark tests showed that CRASA annotation excelled in both the sensitivity and specificity categories. When CRASA was applied to the analysis of human Chromosomes 21 and 22, an additional 83 potential genes were identified. With its large-scale processing capability, CRASA can be used as a robust tool for genome annotation with high accuracy by matching the EST sequences precisely to the genomic sequences.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A CAD System Framework for the Automatic Diagnosis and Annotation of Histological and Bone Marrow Images

Due to ever increasing of medical images data in the world’s medical centers and recent developments in hardware and technology of medical imaging, necessity of medical data software analysis is needed. Equipping medical science with intelligent tools in diagnosis and treatment of illnesses has resulted in reduction of physicians’ errors and physical and financial damages. In this article we pr...

متن کامل

Clustering of a Number of Genes Affecting in Milk Production using Information Theory and Mutual Information

Information theory is a branch of mathematics. Information theory is used in genetic and bioinformatics analyses and can be used for many analyses related to the biological structures and sequences. Bio-computational grouping of genes facilitates genetic analysis, sequencing and structural-based analyses. In this study, after retrieving gene and exon DNA sequences affecting milk yield in dairy ...

متن کامل

Parallel Generation of t-ary Trees

A parallel algorithm for generating t-ary tree sequences in reverse B-order is presented. The algorithm generates t-ary trees by 0-1 sequences, and each 0-1 sequences is generated in constant average time O(1). The algorithm is executed on a CREW SM SIMD model, and is adaptive and cost-optimal. Prior to the discussion of the parallel algorithm a new sequential generation with O(1) average time ...

متن کامل

Time and Space Complexity Reduction of a Cryptanalysis Algorithm

Binary Decision Diagram (in short BDD) is an efficient data structure which has been used widely in computer science and engineering. BDD-based attack in key stream cryptanalysis is one of the best forms of attack in its category. In this paper, we propose a new key stream attack which is based on ZDD(Zero-suppressed BDD). We show how a ZDD-based key stream attack is more efficient in time and ...

متن کامل

Time and Space Complexity Reduction of a Cryptanalysis Algorithm

Binary Decision Diagram (in short BDD) is an efficient data structure which has been used widely in computer science and engineering. BDD-based attack in key stream cryptanalysis is one of the best forms of attack in its category. In this paper, we propose a new key stream attack which is based on ZDD(Zero-suppressed BDD). We show how a ZDD-based key stream attack is more efficient in time and ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Genome research

دوره 13 2  شماره 

صفحات  -

تاریخ انتشار 2003